Skip to content

feat: Durable Web Search golem:web-search API Across Different Providers in Rust #66

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 40 commits into from

Conversation

Rutik7066
Copy link
Contributor

Added Tavily, Serper, Brave & Google custom search.
Demo link : https://drive.google.com/file/d/16RxMlgtNzEp9Z1vYN-JZskMXBHARhuAQ/view?usp=sharing
Note: Microsoft Bing is not added. Microsoft will discontinue the bing api support after 11th August 2025
Ref : https://www.microsoft.com/en-us/bing/apis/bing-web-search-api

claim #34
/fixes #34

@Rutik7066
Copy link
Contributor Author

My last PR was closed at this link #46. I accidentally deleted the branch from GitHub, which caused the PR to close automatically. Fortunately, I had the branch saved locally, so I opened this new PR.


let page_size = self.params.max_results.unwrap_or(10);
let current_offset = *self.current_offset.borrow();
let new_offset = current_offset + page_size;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the brave docs the offset should be incremented by 1 to get the next page of resuts, not page_size:
https://api-dashboard.search.brave.com/app/documentation/web-search/query#WebSearchAPIQueryParameters

"No more results available".to_string(),
));
}
*self.current_start_index.borrow_mut() = *self.current_start_index.borrow_mut() + 1_u32;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mschuwalow
Copy link
Contributor

Requested access to the demo video, please grant access

@Rutik7066
Copy link
Contributor Author

@mschuwalow, granted access for the video. Please check, and I will update changes asap

@Rutik7066 Rutik7066 requested a review from mschuwalow July 17, 2025 20:27
@Rutik7066
Copy link
Contributor Author

Ohh, I understood now. That is why when we transition to live, it starts from the beginning.

@Rutik7066
Copy link
Contributor Author

@mschuwalow Please check i updated the changes
Screenshot 2025-07-20 235439

@Rutik7066
Copy link
Contributor Author

@mschuwalow Are we ready to merge?

println!("[DURABILITY] unwrapped_search_session: Creating new BraveSearchSession");
LOGGING_STATE.with_borrow_mut(|state| state.init());

with_config_key(&[Self::API_KEY_ENV_VAR], Err, |keys| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: please move reading the env var into the next_page call.

Otherwise a restored session might use a different api key then a live session (as they end up reading the environment variables at different times). This might make sense, but for now we want to avoid divergence between replay as live as much as possible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will remove the env reading and eliminate that function. I'm going to take out unwrapped_search_session and move the env reading to next_page, which will introduce some overhead due to creating a new client and reading env each time. However, it's a necessary adjustment.

Comment on lines 69 to 90
let (results, metadata) =
convert_response_to_results(response, &self.params, Some(current_offset));

*self.last_metadata.borrow_mut() = metadata.clone();

if results.is_empty() {
*self.has_more_results.borrow_mut() = false;
return Err(SearchError::BackendError("No more results".to_string()));
}

if let Some(metadata) = &metadata {
let api_has_more = metadata.next_page_token.is_some();
let within_limits = new_offset < 9;
*self.has_more_results.borrow_mut() = api_has_more && within_limits;
} else {
*self.has_more_results.borrow_mut() = false;
}
*self.current_page.borrow_mut() += 1;
results
.into_iter()
.next()
.ok_or_else(|| SearchError::BackendError("No results returned".to_string()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic that you execute after the request is sent to update the internal state will need to be reproduced in session_for_page, as otherwise the internal state of the session will be different for live and replay (which will lead to different results).

I think your life will be easier if you move all checks / calculations into the beginning of next_page if possible. That way both replay and live should behave consistently. If this is not possible, you might need to persist additional state as part of replay so you can restore the internal session state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, I have a solution for this. Since we are tracing the page_count in durable layer, then we can eliminate calculations from the next_page and create a separate method in extendedtrait to update pagination state with the tracked page count from durable. In the case of replay, we will increase page_count by one so that it will start exactly where the crash happened

@Rutik7066
Copy link
Contributor Author

Okay I will update asap

…hod across multiple web search components to streamline session management.
@Rutik7066
Copy link
Contributor Author

@mschuwalow updated the changes please take a look.

@Rutik7066
Copy link
Contributor Author

will complete exa asap

@Rutik7066
Copy link
Contributor Author

@mschuwalow Please check now i have addressed all the comments.

let state = self.state.borrow();
let result = match &*state {
Some(DurableSearchSessionState::Live { session, .. }) => session.get_metadata(),
_ => None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case needs to be handled (we just finished replay and the user asks us for the current metadata).

with_persistence_level(PersistenceLevel::PersistNothing, || {
session.next_page()
});
let persisted_result = durability.persist_infallible(NoInput, result.clone());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should you use perist here instead of persist_infallible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mschuwalow I am trying to understand the difference between persist_infallible and persist, but I cannot find any documentation on that. Could you please explain?

@Rutik7066
Copy link
Contributor Author

For now, I am reverting EXA since it does not support pagination.

@Rutik7066
Copy link
Contributor Author

@mschuwalow Updated the changes. Please take a look.

@mschuwalow
Copy link
Contributor

I decided to go with #60, as it's closer to being merged.

Thank you for your work on this, I can tell you put a lot of work into it.

@mschuwalow mschuwalow closed this Jul 22, 2025
@Rutik7066
Copy link
Contributor Author

@mschuwalow, what are you still asking for changes on that PR, and closed mine. Did you look at the code at least after changes?

@Rutik7066
Copy link
Contributor Author

@mschuwalow Last time you did the same #2. My implementation was good from a maintenance perspective as well. But you still chose that PR.
you still can check and compare, which one is better and close.
cc @jdegoes @vigoo

@Rutik7066 Rutik7066 deleted the web-search branch July 23, 2025 14:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement Durable Web Search golem:web-search API Across Different Providers in Rust
2 participants